  • David 57 posts 80 karma points
    Oct 28, 2010 @ 22:47
    David
    0

    Sitemap: http://{HTTP_HOST}/sitemap

    I'm new to dealing with robots.txt files. Is this a line I can just add to robots.txt, or can you provide a sample of what it would look like? I haven't been able to find an example using Sitemap: in any robots.txt file to verify the syntax.

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Nov 01, 2010 @ 08:40
    Sebastiaan Janssen
    0

    Yes, I investigated what the syntax should be. Install this package and put this in your robots.txt file:

    Sitemap: http://{HTTP_HOST}/sitemap

    The package will replace {HTTP_HOST} with your site's host name; have a look at http://www.cultiv.nl/robots.txt for an example.
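
    As a rough sketch of a complete file (the User-agent and Disallow lines are illustrative assumptions, not something the package requires), a robots.txt using this placeholder might look like:

    ```text
    User-agent: *
    Disallow: /umbraco/

    Sitemap: http://{HTTP_HOST}/sitemap
    ```

    The Sitemap line stands on its own and is not tied to any User-agent group, so it can go anywhere in the file.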

  • David 57 posts 80 karma points
    Nov 01, 2010 @ 13:27
    David
    1

    Thank you for the quick reply.  I figured it out late Friday and should have replied back.  Thanks for confirming it though and thanks for the AWESOME package for Umbraco.  It works great.

     

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Nov 01, 2010 @ 13:47
    Sebastiaan Janssen
    0

    No problem, and you're very welcome! :-)

  • KK 35 posts 55 karma points
    Jun 29, 2011 @ 05:55
    KK
    0

    Hello,

    I have installed this package and added this line:

    Sitemap: http://{HTTP_HOST}/sitemap

    to my robots.txt file. How can I verify that it replaces "{HTTP_HOST}" with my current website URL?

    Can you please guide me?


  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jun 29, 2011 @ 10:09
    Sebastiaan Janssen
    0

    Go to http://yoursite/robots.txt and see that the string has magically been replaced :)

  • KK 35 posts 55 karma points
    Jun 29, 2011 @ 14:07
    KK
    0

    Hello,

    Yes, I have checked it that way too. My website URL is http://u1test.dadabhagwan.org.peppermint.arvixe.com/robots.txt

    but the file still shows "http://{HTTP_HOST}/sitemap" rather than my site URL.

    So what is the problem?

  • David 57 posts 80 karma points
    Jun 29, 2011 @ 14:15
    David
    0

    When I use this

    http://u1test.dadabhagwan.org.peppermint.arvixe.com/robots.txt

    I get what I expect to see, the directives of what directories search engines can crawl

    When I use this

    http://u1test.dadabhagwan.org.peppermint.arvixe.com/sitemap

    I get the XML sitemap listing I'd expect to see.  It appears to work correctly.

     

     

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jun 29, 2011 @ 14:32
    Sebastiaan Janssen
    0

    It looks like the installer couldn't update your web.config. Try adding these keys:

    In system.webServer/handlers:

    <remove name="Cultiv.DynamicRobots" />
    <add name="Cultiv.DynamicRobots" path="robots.txt" verb="*" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" preCondition="integratedMode" />

    And/or in system.web/httpHandlers:

    <add verb="*" path="robots.txt" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" />
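
    For orientation, here is a minimal sketch of where those entries sit inside web.config. It assumes the standard ASP.NET layout; the comments stand in for whatever entries your site already has, which should be left in place:

    ```xml
    <configuration>
      <system.web>
        <httpHandlers>
          <!-- existing handlers stay as they are -->
          <add verb="*" path="robots.txt" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" />
        </httpHandlers>
      </system.web>
      <system.webServer>
        <handlers>
          <!-- existing handlers stay as they are -->
          <remove name="Cultiv.DynamicRobots" />
          <add name="Cultiv.DynamicRobots" path="robots.txt" verb="*" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" preCondition="integratedMode" />
        </handlers>
      </system.webServer>
    </configuration>
    ```

    Note that the entry with the name and preCondition attributes belongs under system.webServer, while the shorter one belongs under system.web.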
  • KK 35 posts 55 karma points
    Jun 29, 2011 @ 15:59
    KK
    0

    Hello,

    This line, "<add verb="*" path="robots.txt" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" />", is already in my web.config file, so should I also add the first one, i.e. "<remove name="Cultiv.DynamicRobots" /><add name="Cultiv.DynamicRobots" path="robots.txt" verb="*" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" preCondition="integratedMode" />"?

    What do you say?

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jun 29, 2011 @ 16:03
    Sebastiaan Janssen
    0

    Yes, if you're running IIS7+ you'll be needing the first one. Shame that it didn't get inserted, I need to review the installer I think.

  • KK 35 posts 55 karma points
    Jun 29, 2011 @ 17:48
    KK
    0

    Hello,

    I have also added the first line to my web.config file, but it's still not working.

    Is there a solution for this?

  • KK 35 posts 55 karma points
    Jul 08, 2011 @ 16:26
    KK
    0

    hello,

    is there any solution for this problem?

  • KK 35 posts 55 karma points
    Jul 20, 2011 @ 07:43
    KK
    0

    Hello,

    I have tried adding the two lines below to the web.config file:

    In system.webservers/handlers:

     <add verb="*" path="robots.txt" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" />

    In system.web/httpHandlers:

    <remove name="Cultiv.DynamicRobots" /><add name="Cultiv.DynamicRobots" path="robots.txt" verb="*" type="Cultiv.DynamicRobots.RobotsTxt, Cultiv.DynamicRobots" preCondition="integratedMode" />

    Even so, I am not getting my site URL in the robots.txt file. Can you guide me on what the problem is and what I should do?

    See the url : http://u1test.dadabhagwan.org.peppermint.arvixe.com/robots.txt

     

     

     
  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jul 20, 2011 @ 09:25
    Sebastiaan Janssen
    0

    Sorry, I don't know what to tell you. The only other problems I can think of are that your robots.txt file is not in C:\HostingSpaces\dadatest\u1test.dadabhagwan.org\wwwroot\robots.txt, or that your web hosting environment is running in medium trust mode, which would probably prevent you from using this package.

  • KK 35 posts 55 karma points
    Jul 20, 2011 @ 10:58
    KK
    0

    Hello,

    I have checked, and my robots.txt file is at the path you mentioned, i.e. C:\HostingSpaces\dadatest\u1test.dadabhagwan.org\wwwroot\robots.txt.

    My web host is also running in full trust mode, so what could be the reason?

     

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jul 20, 2011 @ 16:53
    Sebastiaan Janssen
    0

    I really don't know. This is the full source code; you might be able to debug it:

    using System.IO;
    using System.Web;

    namespace Cultiv.DynamicRobots
    {
        public class RobotsTxt : IHttpHandler
        {
            public void ProcessRequest(HttpContext context)
            {
                context.Response.ContentType = "text/plain";

                // Resolve the physical path of the robots.txt template in the site root.
                var robotsTemplate = context.Server.MapPath(VirtualPathUtility.ToAbsolute("/robots.txt"));

                if (File.Exists(robotsTemplate))
                {
                    // The using block disposes the reader even if reading or writing throws.
                    using (var streamReader = File.OpenText(robotsTemplate))
                    {
                        var input = streamReader.ReadToEnd();
                        // Substitute the host name of the current request for the placeholder.
                        context.Response.Write(input.Replace("{HTTP_HOST}", context.Request.ServerVariables["HTTP_HOST"]));
                    }
                    return;
                }

                // No template on disk: report that robots.txt does not exist.
                context.Response.StatusCode = 404;
            }

            public bool IsReusable
            {
                get { return true; }
            }
        }
    }
  • KK 35 posts 55 karma points
    Jul 21, 2011 @ 06:35
    KK
    0

    Hello,

    Can you tell me in which file to write this code and where to upload it?

     

     

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jul 21, 2011 @ 16:31
    Sebastiaan Janssen
    0

    I can't really teach you how to code in C# using Visual Studio..

    Do you actually need this package, by the way? Do you have more than one domain name for your site?

    It's very simple code. The only two reasons I can think of why it would not work are that the file is not found (which may mean that the robotsTemplate variable is not being filled correctly), or that the HttpHandler is not executed at all.

    Try putting this file in your /bin folder, overwriting the existing one. Then try going to http://u1test.dadabhagwan.org.peppermint.arvixe.com/robots.txt again. If it still doesn't work, have a look in the umbracoLog table; there should be one or two messages:

    "DynamicRobots is being picked up and used."

    "Error finding robots.txt file at this location: D:\Dev\MySite\robots.txt"

    If you see either of these two messages, then at least the HttpHandler works. If you see the one that starts with "Error", compare the path in the message to the actual path of your robots.txt file.

  • KK 35 posts 55 karma points
    Jul 26, 2011 @ 06:20
    KK
    0

    No, I have only one domain name.

    I have overwritten the DLL file and checked this URL, http://u1test.dadabhagwan.org.peppermint.arvixe.com/robots.txt, but it's still not working, and I'm not getting any error message either.

    So what could be the reason?

  • Sebastiaan Janssen 5045 posts 15476 karma points MVP admin hq
    Jul 26, 2011 @ 08:32
    Sebastiaan Janssen
    0

    The error message should be in the umbracoLog table in your database, have a look there.

    Anyway, for a single domain you don't need this package; just hardcode your domain name in the robots.txt file:

    Sitemap: http://u1test.dadabhagwan.org.peppermint.arvixe.com/sitemap